A Roadmap for a Rigorous Science of Interpretability
Abstract
From autonomous cars and adaptive email-filters to predictive policing systems, machine learning (ML) systems are increasingly ubiquitous; they outperform humans on specific tasks [Mnih et al., 2013, Silver et al., 2016, Hamill, 2017] and often guide processes of human understanding and decisions [Carton et al., 2016, Doshi-Velez et al., 2014]. The deployment of ML systems in complex applications has led to a surge of interest in systems optimized not only for expected task performance but also for other important criteria such as safety [Otte, 2013, Amodei et al., 2016, Varshney and Alemzadeh, 2016], nondiscrimination [Bostrom and Yudkowsky, 2014, Ruggieri et al., 2010, Hardt et al., 2016], avoiding technical debt [Sculley et al., 2015], or providing the right to explanation [Goodman and Flaxman, 2016]. For ML systems to be used safely, satisfying these auxiliary criteria is critical. However, unlike measures of performance such as accuracy, these criteria often cannot be completely quantified. For example, we might not be able to enumerate all unit tests required for the safe operation of a semi-autonomous car or all confounds that might cause a credit scoring system to be discriminatory. In such cases, a popular fallback is the criterion of interpretability: if the system can explain its reasoning, we can then verify whether that reasoning is sound with respect to these auxiliary criteria.
Unfortunately, there is little consensus on what interpretability in machine learning is and how to evaluate it for benchmarking. Current interpretability evaluation typically falls into two categories. The first evaluates interpretability in the context of an application: if the system is useful in either a practical application or a simplified version of it, then it must be somehow interpretable (e.g. Ribeiro et al. [2016], Lei et al. [2016], Kim et al. [2015a], Doshi-Velez et al. [2015], Kim et al. [2015b]). The second evaluates interpretability via a quantifiable proxy: a researcher might first claim that some model class (e.g. sparse linear models, rule lists, gradient boosted trees) is interpretable and then present algorithms to optimize within that class (e.g. Bucilu et al. [2006], Wang et al. [2017], Wang and Rudin [2015], Lou et al. [2012]).
To a large extent, both evaluation approaches rely on some notion of “you’ll know it when you see it.” Should we be concerned about a lack of rigor? Yes and no. The notions of interpretability above appear reasonable because they are reasonable: they meet the first test of having face validity on the correct test set of subjects, namely human beings. However, this basic notion leaves many kinds of questions unanswerable: Are all models in all defined-to-be-interpretable model classes equally interpretable? Quantifiable proxies such as sparsity may seem to allow for comparison, but how does one think about comparing a model sparse in features to a model sparse in prototypes? Moreover, do all applications have the same interpretability needs? If we are to move this field forward, to compare methods and understand when methods may generalize, we need to formalize these notions and make them evidence-based.
The objective of this review is to chart a path toward the definition and rigorous evaluation of interpretability. The need is urgent: recent European Union regulation will require algorithms...
Similar resources
A New Model Representation for Road Mapping in Emerging Sciences: A Case Study on Roadmap of Quantum Computing
One of the solutions for organizations to succeed in highly competitive markets is to move toward emerging sciences. These areas provide many opportunities, but if organizations do not meet the requirements of emerging sciences, they may fail and eventually enter a crisis. In this matter, one of the important requirements is to develop suitable roadmaps in a variety of fields such as strategic, ca...
Science and technology roadmapping in AJA University of medical sciences
Background: The great scientific Jihad that has been launched in our country by the order of the Supreme leader primarily requires a roadmap and a strategic plan. According to the significant place for science and technology in the 2025 perspective (20-years national master-plan) for the Iranian armed force (AJA), achieving the highest rank of health and the best scientific reputation in m...
Roadmap to Stop the Predatory Journals: Author's Perspective
Recent disgracing reports are warning the scientific communities to think more about the solutions to win the battle against predatory journals and publishers. Current integrity and accuracy in science is a result of decades of honest works and publications which are an asset, now everyone as stakeholders of science should feel the responsibility to sustain its high privileged level. The ethica...
An essay on accessing the brownfields redevelopment roadmap appropriate with Iran's condition
The concept of redevelopment is accompanied by actions and forecasts to improve the quality of the physical-spatial environment in the cities; that is, the spatial environment is improved through the emergence of new facilities and conditions. This requirement occurs when the coherence, coordination and the performance of the urban area is diminishing and is not responsive to the requiremen...
Method integration: An approach to develop agent oriented methodologies
Agent oriented software engineering (AOSE) is an emerging field in computer science and proposes some systematic ideas for multi agent systems analysis, implementation and maintenance. Despite the various methodologies introduced in the agent-oriented software engineering, the main challenges are defects in different aspects of methodologies. According to the defects resulted from weaknesses ...
Journal: CoRR
Volume: abs/1702.08608
Publication date: 2017